Learn2Mine: An Open-Source Cloud-Based Informatics Platform For Integrated Teaching and Data Exploration
Abstract
Informatics platforms exist for a multitude of purposes and specialties (e.g., Taverna, Galaxy, RapidMiner). The development of such platforms is vital given the expanding role of Big Data. In keeping with the trend of programs moving toward an open-source, cloud-based environment, we are developing an informatics and data mining platform that integrates teaching and data exploration through the development of gameful experiences. Learn2Mine is built upon the widely used open-source project Weka. Unlike other informatics platforms, Learn2Mine utilizes a decomposition-based flow rather than a connected GUI workflow like that of RapidMiner, Galaxy, and Taverna. With this interface, the focus shifts toward discovery by offering levels of abstraction. The system is designed to build a user’s confidence in data science techniques intuitively and effectively by completing the feedback loop between application and domain expert. The integrated teaching environment creates a positive feedback mechanism: students learn complex data mining techniques while earning achievements and being directed toward optimal solutions. Scientists can disable this feedback when a pure data analysis and exploration system is desired. Learn2Mine’s core data storage environment, built upon Google Drive, provides significant utility to multidisciplinary teams. Datasets, analyses, and customized algorithms can be shared among collaborating scientists and with the data science community. Customizable templates allow scientists with a strong specialization in a field, such as bioinformatics, to develop a statistical tool that is much more meaningful to them than it would be to, for example, a geoinformatician. The system’s initial implementation and associated feedback integration are tested on their ability to analyze standard pedagogical datasets.
The utility of this software will also be demonstrated through ongoing collaborations with the Air Force Research Laboratory, where datasets of human fatigue and toxicology will be analyzed.
Publication date: 2012